Depth Creates No Bad Local Minima
Authors: Haihao Lu, Kenji Kawaguchi
Abstract
In deep learning, depth, as well as nonlinearity, creates non-convex loss surfaces. Does depth alone, then, create bad local minima? In this paper, we prove that without nonlinearity, depth alone does not create bad local minima, even though it induces a non-convex loss surface. Using this insight, we greatly simplify a recently proposed proof that all local minima of feedforward deep linear neural networks are global minima. Our theoretical result generalizes previous results while requiring fewer assumptions.
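To make the claim concrete, here is a minimal numeric sketch (our illustration, not code from the paper) on the smallest possible deep linear network, the scalar two-layer model f(w1, w2) = w2·w1·x. Fit to the single point (x, y) = (1, 1), the loss L(w1, w2) = (w1·w2 − 1)² is non-convex, yet its only non-global critical point, the origin, is a saddle rather than a bad local minimum:

```python
import numpy as np

# Loss of the scalar two-layer linear model w2 * w1 * x fit to (x, y) = (1, 1).
def loss(w1, w2):
    return (w1 * w2 - 1.0) ** 2

# Non-convexity: the Hessian of the loss at the origin is [[0, -2], [-2, 0]],
# with eigenvalues -2 and +2, so (0, 0) is a saddle point, not a minimum.
H = np.array([[0.0, -2.0], [-2.0, 0.0]])
print("eigenvalues at (0, 0):", np.linalg.eigvalsh(H))

# No bad local minima: gradient descent from a generic start reaches zero
# loss, i.e., a global minimum on the hyperbola w1 * w2 = 1.
rng = np.random.default_rng(0)
w1, w2 = rng.normal(size=2)
for _ in range(2000):
    g = 2.0 * (w1 * w2 - 1.0)
    w1, w2 = w1 - 0.05 * g * w2, w2 - 0.05 * g * w1
print("final loss:", loss(w1, w2))
```

The same mechanism is behind the general result: products of weight matrices destroy convexity, but every local minimum of the deep linear model still attains the global optimum.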
Similar Papers
The loss surface and expressivity of deep convolutional neural networks
We analyze the expressiveness and loss surface of practical deep convolutional neural networks (CNNs) with shared weights. We show that such CNNs produce linearly independent (and hence linearly separable) features at every "wide" layer that has more neurons than the number of training samples. This condition holds, e.g., for the VGG network. Furthermore, we provide for such wide CNNs necessary a...
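As a quick sanity check on the wide-layer claim, the following toy sketch (ours, using a random fully connected ReLU layer rather than the paper's CNN setting with shared weights) verifies that when a layer has more neurons than training samples, generic weights map the samples to linearly independent feature vectors:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, width = 20, 10, 64            # samples, input dim, layer width (width > N)

X = rng.normal(size=(N, d))         # training samples
W = rng.normal(size=(d, width))     # generic (random) layer weights
features = np.maximum(X @ W, 0.0)   # ReLU activations, shape (N, width)

# Rank N means the N feature vectors are linearly independent, so any labels
# on these samples can be fit by a linear map on top of this layer.
print("feature rank:", np.linalg.matrix_rank(features), "expected:", N)
```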
Local minima in training of deep networks
There has been a lot of recent interest in characterizing the error surface of deep models. This stems from a long-standing question: given that deep networks are highly nonlinear systems optimized by local gradient methods, why do they not seem to be affected by bad local minima? It is widely believed that training of deep models using gradient methods works so well because the error s...
Non-Contact Pulmonary Functional Testing Through an Improved Photometric Stereo Approach
A non-contact, computer-vision-based system is developed for pulmonary function testing. The unique and novel feature of the system is that it views the patient from both front and back and creates a 3D structure of the whole torso. By observing the 3D structure of the torso over time, the amount of air inhaled and exhaled is estimated. The photometric stereo method is used to recover local...
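For background, this is a minimal sketch of classical Lambertian photometric stereo, the general technique the snippet names (not this paper's improved variant): with three images under known, non-coplanar light directions, the per-pixel intensities I = albedo · (L @ n) can be solved for the surface normal n.

```python
import numpy as np

s = 1.0 / np.sqrt(2.0)
# Known, non-coplanar unit light directions, one row per image.
L = np.array([[0.0, 0.0, 1.0],
              [s,   0.0, s  ],
              [0.0, s,   s  ]])
I = np.array([0.9, 0.8, 0.5])   # observed intensities at one pixel

g = np.linalg.solve(L, I)       # Lambertian model: I = albedo * L @ n
albedo = np.linalg.norm(g)      # |g| is the albedo, g / |g| the unit normal
normal = g / albedo
print("albedo:", albedo, "normal:", normal)
```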
Towards Effective Low-bitwidth Convolutional Neural Networks
In this work, we aim to effectively train convolutional neural networks with both low-bitwidth weights and low-bitwidth activations. Optimization of a low-precision network is typically extremely unstable, and it is easily trapped in a bad local minimum, which results in noticeable accuracy loss. To mitigate this problem, we propose two novel approaches. On one hand, unlike previous methods that ...
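For context, here is a generic k-bit uniform weight quantizer with a straight-through gradient estimator; this is standard background for low-bitwidth training, not the two approaches the paper proposes (the snippet is truncated before describing them):

```python
import numpy as np

def quantize_weights(w, bits=2):
    """Uniform quantization of w to 2**bits levels over [-max|w|, +max|w|]."""
    levels = 2 ** bits - 1                    # number of quantization steps
    scale = np.max(np.abs(w)) + 1e-12
    unit = (w / scale + 1.0) / 2.0            # map to [0, 1]
    q = np.round(unit * levels) / levels      # snap to the quantization grid
    return (q * 2.0 - 1.0) * scale            # map back to [-scale, +scale]

w = np.random.default_rng(0).normal(size=6)
print("full precision:", np.round(w, 3))
print("2-bit weights: ", np.round(quantize_weights(w, bits=2), 3))
# During training, the forward pass would use quantize_weights(w) while the
# backward pass treats the quantizer as the identity (the straight-through
# estimator), since np.round has zero gradient almost everywhere.
```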
No bad local minima: Data independent training error guarantees for multilayer neural networks
We use smoothed analysis techniques to provide guarantees on the training loss of multilayer neural networks (MNNs) at differentiable local minima. Specifically, we examine MNNs with piecewise linear activation functions, quadratic loss, and a single output, under mild over-parametrization. We prove that for an MNN with one hidden layer, the training error is zero at every differentiable local mi...
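The following small experiment (our illustration, consistent with the stated guarantee but not the paper's smoothed-analysis proof) trains a one-hidden-layer leaky-ReLU network with a single output and quadratic loss, over-parametrized so that the hidden width exceeds the number of samples; gradient descent drives the training error to (near) zero:

```python
import numpy as np

rng = np.random.default_rng(0)
N, d, h = 16, 8, 64                        # samples, input dim, hidden width > N
X = rng.normal(size=(N, d))
y = rng.normal(size=(N, 1))

W1 = 0.2 * rng.normal(size=(d, h))         # one hidden layer, single output
W2 = 0.2 * rng.normal(size=(h, 1))
leak = 0.1                                 # piecewise linear activation

for _ in range(20000):
    Z = X @ W1
    A = np.where(Z > 0.0, Z, leak * Z)     # leaky ReLU
    err = A @ W2 - y                       # quadratic loss: mean(err ** 2)
    dZ = (err @ W2.T) * np.where(Z > 0.0, 1.0, leak)
    W2 -= 0.02 * (A.T @ err) / N
    W1 -= 0.02 * (X.T @ dZ) / N

print("training MSE:", np.mean(err ** 2))  # close to zero
```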
Journal: CoRR
Volume: abs/1702.08580
Published: 2017